Abstract

This paper advocates use of nonparametric statistics. First, the consequence of 
using parametric inferential techniques under non-normality is described. Next, the 
advantages of using nonparametric techniques are presented. The third purpose is to 
demonstrate empirically how infrequently nonparametric statistics appear in studies, 
even those published in the most reputable journals. Fourth, a typology of 
nonparametric statistics is presented for all univariate GLM analyses. Fifth, the 
nonparametric statistics that are available in the most commonly used statistical 
software are delineated. Finally, nonparametric effect size indices are outlined. 

A Call for Greater Use of Nonparametric Statistics

Choosing between a parametric and a nonparametric statistic can be one of the
most difficult steps in analyzing data. Many researchers struggle with this step, or
simply skip it and proceed to use the more common parametric statistic. Yet the
process of checking assumptions in order to justify use of the parametric statistic, and
being certain that the data fit the assumptions, is paramount and should be undertaken
regularly (Kerlinger, 1964, 1973; Nunnally, 1975; Tukey, 1977).

If the data violate the assumptions that justify use of the desired parametric
statistic, then the data could be transformed so that they more adequately fit the
assumptions (Kirk, 1982). Indeed, a member of the family of Box-Cox transformations
could be used (Box & Cox, 1964). For example, if the score distribution has moderate 
positive skew, then the square root transformation might be most appropriate; for 
severe positive skew a logarithm transformation might be useful; for moderate negative 
skew, reflecting the variable (i.e., subtracting each score from the largest score plus 
one) and then taking the square root might suffice; for severe negative skew, reflecting 
the variable and then taking the logarithm might be effective; finally, for J-shaped 
distributions, the inverse transformation might be the most adequate (Box & Cox, 1964; 
Bradley, 1982; Tabachnick & Fidell, 1996).
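
The transformations listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from the cited sources: the function names and the shape labels are hypothetical, the natural logarithm is assumed because the text does not specify a base, and all raw scores are assumed to be positive so that the square root, logarithm, and inverse are defined.

```python
import math


def reflect(scores):
    """Reflect a variable: subtract each score from (largest score + 1)."""
    pivot = max(scores) + 1
    return [pivot - x for x in scores]


def transform(scores, shape):
    """Apply the transformation suggested for a given distribution shape.

    The shape labels are hypothetical names chosen for this sketch.
    Scores are assumed to be strictly positive.
    """
    if shape == "moderate_positive_skew":
        return [math.sqrt(x) for x in scores]
    if shape == "severe_positive_skew":
        return [math.log(x) for x in scores]  # natural log is an assumption
    if shape == "moderate_negative_skew":
        return [math.sqrt(x) for x in reflect(scores)]
    if shape == "severe_negative_skew":
        return [math.log(x) for x in reflect(scores)]
    if shape == "j_shaped":
        return [1.0 / x for x in scores]
    raise ValueError("unknown shape label: " + shape)


if __name__ == "__main__":
    raw = [1, 4, 9]
    print(transform(raw, "moderate_positive_skew"))
    print(reflect(raw))
```

Note that reflection reverses the order of the scores, so the direction of any subsequent effect must be interpreted accordingly.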

Whatever transformation is used, it is essential that the same assumptions are
checked on the transformed data. Provided that an appropriate transformation is selected,
transforming the data can be an extremely useful method for dealing with outliers, as
well as for deviations from the assumptions of normality, linearity, and homoscedasticity
(i.e., variability of scores for one continuous variable being approximately the same at
all values of another continuous variable). However, transformations can be
problematic for two major reasons. First and foremost, several
transformations often must be attempted before the most appropriate one is found. This
can be extremely time-consuming and frustrating for researchers working under tight 
deadlines for their research. Second, even when a suitable transformation is found, the 
subsequent inferential analysis is often more difficult to interpret than would have been 
the case if the variable had been analyzed in its original form. Thus, although some 
textbook authors (e.g., Tabachnick & Fidell, 1996) strongly recommend transforming 
data when assumptions are violated, very few researchers use this technique. In fact, 
the use of transformations typically is not considered important by instructors of 
graduate-level statistics and research methodology courses, nor is this topic even 
covered in these classes taught at the introductory level (Mundfrom, Shaw, Thomas, 
Young, & Moore, 1998). Such a lack of coverage likely leads to a lack of awareness of 
the potential usefulness of data transformations. 

Because of the lack of awareness of data transformations coupled with the 
problems described above when they are used, few researchers transform their data. 
Consistent with this assertion, Keselman et al. (1998), who examined articles published 
in 17 prominent educational and behavioral science research journals in 1994 or
1995, reported that data transformations were used in only 7.59% of articles involving
between-subjects univariate designs (n = 79). Instead of using data transformations, 
some researchers decide to utilize the other option available for addressing assumption 
violations, namely, the nonparametric statistic (Gliner & Morgan, 2000; Hinkle, Wiersma, 
& Jurs, 1998; Newton & Rudestam, 1999). As stated by Hollander and Wolfe (1973, p.
1), nonparametric statistics represent a class of statistical methods that have specific
“desirable properties that hold under relatively mild assumptions regarding the
underlying population(s) from which the data are obtained.” Hotelling and Pabst (1936)
are credited with developing this field (Savage, 1953).

Since the publication of the first textbook devoted exclusively to nonparametric
procedures approximately half a century ago (Kendall, 1948), there has been a
proliferation of textbooks dedicated to this topic. Yet, use of nonparametric statistics is
extremely scant among researchers (Elmore & Woehlke, 1996; Jenkins, Fuqua, &
Froehle, 1984). There are many possible reasons for this lack of use (Anderson, 1961;
Blair, 1985). One reason stems from the fact that many researchers had graduate-level 
instruction in statistics that was taught in a rote manner. Another reason is that some 
researchers do not remember or do not know how to check their data for possible 
assumption violations (Sawilowsky, 1990). A third explanation for the lack of use of 
nonparametric statistics might arise from the fact that many graduate-level programs 
minimize students’ exposure to statistical content and methodology. This trend started during
the 1970s, when nonparametric statistics were given a secondary role to parametric
statistics in many textbooks (Sawilowsky, 1990; Winn & Johnson, 1978). Indeed, Aiken, 
West, Sechrest, and Reno (1990) reported that statistical and methodological curricula 
had advanced little in the previous 20 years. Thus, it is likely that many current 
researchers did not have much exposure to nonparametric statistics in their graduate 
courses. 

A fourth reason for the scarcity of use of nonparametric statistics is that many 
researchers believe parametric statistics are extremely robust to violations of their
assumptions (Boneau, 1960; Box, 1954; Glass, Peckham, & Sanders, 1972; Lindquist,
1953). Kerlinger (1964, 1973) and Nunnally (1975) discussed the lack of use of 
nonparametric statistics as stemming from the assumption that nonparametric tests are 
less powerful than parametric statistics. Finally, the dearth of use of nonparametric 
methods might have arisen from a failure, fear, or even refusal to recognize that 
analytical techniques that were once popular no longer reflect best practices, and, 
moreover, may now be deemed inappropriate, misleading, invalid, or obsolete. 

Thus, this paper advocates use of nonparametric statistics. First, the role of
statistical assumptions is described. Second, the consequence of using parametric
inferential techniques under non-normality is presented. Third, the advantages of
utilizing nonparametric techniques are outlined. Fourth, it is demonstrated
empirically how infrequently nonparametric statistics appear in studies, even those
published in the most reputable journals. Fifth, a typology of nonparametric statistics
is presented for all univariate GLM analyses. Sixth, the nonparametric statistics that are
available in the most commonly used statistical software are delineated. Finally,
nonparametric effect size indices are outlined.

Nonparametric Statistics and the Role of Statistical Assumptions 

Most data in social science research fail to meet the assumptions for parametric 
statistics (Micceri, 1989). For these cases, if the data are not transformed, then 
nonparametric techniques should be utilized. To know when to use nonparametric 
statistics, a basic understanding of the role of statistical assumptions is necessary. 
Statistical assumptions can be thought of as “rules” or “guidelines” for a given statistic. 
Before a statistic is used, its assumptions need to be checked to
see if they have been met. All univariate parametric analyses, including analyses of
bivariate relationships, are subsumed by the general linear model (GLM), and are 
therefore bounded by its assumptions. An important assumption that prevails for all 
univariate GLM analyses is that the dependent variable is normally distributed. 

The more the normality assumption is violated, the less justified it is to rely on 
parametric statistics to conduct null hypothesis significance tests. However, many 
parametric statistics are assumed to be “robust” against reasonable violations of 
assumptions (Boneau, 1960, 1962; Box, 1953; Gardner, 1975; Hinkle et al., 1998;
Minium, 1978; Newton & Rudestam, 1999). A statistical procedure or test is considered
to be robust with respect to a particular underlying assumption if it is reasonably
insensitive to slight departures from the assumption (Hollander & Wolfe, 1973). If a
parametric statistic is robust, then it can still be used when the assumption is not 
adequately met. Yet, Bradley (1978) and Singer (1979) contend that parametric 
statistics are not truly robust. Moreover, Bradley (1982) demonstrated that statistical 
inference becomes increasingly less robust as distributions depart from normality. 
Further, Tabachnick and Fidell (1996, p. 70) noted that “even when the statistics are 
used purely descriptively, normality, linearity, and homoscedasticity of variables 
enhance the analysis.” In fact, according to some methodologists (e.g., Bradley, 1978;
Singer, 1979), assumption violations are only tolerated by the overwhelming majority of 
researchers so that the parametric statistics can be used. 

Disturbingly, the majority of studies in the social and behavioral sciences do not 
utilize random samples (Shaver & Norton, 1980a, 1980b), even though “inferential 
statistics is based on the assumption of random sampling from populations” (Glass & 
Hopkins, 1984, p. 177). In fact, randomness, in the form of random error, is the basis for
sampling distributions against which observed findings are compared (Carver, 1993).
Further, use of nonrandom samples increases the chances that scores will be
non-normal. Another factor that contributes to violations of normality is that many data sets
are generated from small samples. These problems render it likely that the underlying 
samples yield scores from dependent measures that depart from normality. Thus, it is 
not surprising that the majority of data in the social and behavioral sciences are not 
normally distributed (Micceri, 1989). 

When the dependent variable deviates from normality, the parametric GLM 
analysis should not be used. Bradley (1968) defined a nonparametric statistic as being
a “distribution-free test . . . which makes no assumptions about the precise form of the
sampled population” (p. 15). Alternatively stated, nonparametric methods are termed 
distribution-free because they can be employed for variables whose joint distribution 
represents any specified distribution, including the bivariate normal, or whose joint 
distribution is not known and therefore is unspecified (Gibbons, 1993). Therefore, when 
the assumption of normality is not met, a nonparametric statistic is the more appropriate 
choice. 

An important question is how far scores must deviate from
normality before nonparametric statistics become essential. With regard to univariate
inferential statistical techniques, Onwuegbuzie and Daniel (2002) have provided 
objective but simple criteria for determining whether scores deviate from normality. 
Specifically, these methodologists stated the following: 
Additionally, for adequate sample sizes, a formal test of statistical significance
can be conducted by utilizing the fact that the ratio of the skewness and kurtosis 
coefficients to their respective standard errors (i.e., standardized skewness and 
standardized kurtosis coefficients) are themselves normally distributed. Most 
other statistical packages print as options skewness and kurtosis coefficients but 
not their standard errors. However, these standard errors can be approximated 
manually (the standard error for skewness is approximately equal to the square
root of 6/n, and the standard error for kurtosis is approximately equal to the
square root of 24/n, where n is the sample size). For both small and large
sample sizes, rather than conducting a test of statistical significance, criteria can 
be used for assessing whether the standardized skewness and/or kurtosis 
coefficients are unacceptably large. One rule of thumb that we offer is that (a) 
standardized skewness and kurtosis coefficients which lie within ±2 suggest no 
serious departures from normality, (b) coefficients outside this range but within 
the ±3 boundary signify slight departures from normality, and (c) standardized 
coefficients outside the ±3 range indicate important departures from normality. 
Using such a rule provides an objective method of assessing normality that is 
based on effect sizes (i.e., standardized coefficients). (p. 75)
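
The rule of thumb quoted above lends itself to a short computational sketch. The Python code below is illustrative only and is not the authors' implementation: the function names are hypothetical, the skewness and kurtosis coefficients are computed from simple sample moments (statistical packages typically apply bias corrections, so their values may differ slightly), kurtosis is expressed as excess kurtosis (zero for a normal distribution), and the boundary values of exactly ±2 and ±3 are assigned to the milder category as an assumption.

```python
import math


def skewness_and_kurtosis(scores):
    """Moment-based skewness and excess kurtosis (no bias correction)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    # Assumes the scores are not all identical (m2 > 0).
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def standardized_coefficients(scores):
    """Divide each coefficient by its approximate standard error."""
    n = len(scores)
    skew, kurt = skewness_and_kurtosis(scores)
    se_skew = math.sqrt(6.0 / n)   # approximation quoted in the text
    se_kurt = math.sqrt(24.0 / n)  # approximation quoted in the text
    return skew / se_skew, kurt / se_kurt


def classify(z):
    """Apply the +/-2 and +/-3 rule of thumb to a standardized coefficient."""
    if abs(z) <= 2:
        return "no serious departure"
    if abs(z) <= 3:
        return "slight departure"
    return "important departure"


if __name__ == "__main__":
    scores = [1] * 9 + [100]  # hypothetical, strongly right-skewed sample
    z_skew, z_kurt = standardized_coefficients(scores)
    print(classify(z_skew), classify(z_kurt))
```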

Consequences of Not Meeting the Assumption

Problems arise when a parametric statistic is used with data that are not normally 
distributed. Labovitz (1967) points out that “a word of caution is necessary. . .it 
frequently turns out that the violation of one assumption does not appreciably alter the 
test, [although] the violation of two or more assumptions frequently does have a marked
effect” (p. 158). Thus, when the assumptions are not met, using a parametric statistic
likely will generate invalid results (Field, 2000; Hinkle, Wiersma, & Jurs, 1998; Newton & 
Rudestam, 1999). 

The use of a parametric statistic when the assumption of normality is grossly 
violated can have serious consequences (Siegel, 1956). In fact, large skewness and 
kurtosis coefficients affect Type I and Type II error rates. For instance, a non-normal 
kurtosis coefficient typically produces an underestimate of the variance of a variable, 
which, in turn, increases the Type I error rate (Tabachnick & Fidell, 1996). Although the 
parametric t-test is typically robust with regard to Type I error under the conditions of
large and equal sample sizes, this test is not powerful when data are characterized
by skewed distributions. In fact, under skewed conditions, the Wilcoxon Rank Sum test,
a nonparametric counterpart of the t-test, is three to four times more powerful (Blair &
Higgins, 1980; Bridge & Sawilowsky, 1999; Nanna & Sawilowsky, 1998), a finding of
which researchers appear to be unaware.
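
For readers unfamiliar with the Wilcoxon Rank Sum test mentioned above, the following Python sketch shows how the statistic is formed. It is a minimal illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, tied scores receive averaged ranks, and the two-sided p-value relies on the large-sample normal approximation without a tie or continuity correction, so for small samples exact tables or a statistical package should be preferred.

```python
import math


def midranks(values):
    """Rank the values from 1 to N, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (W, z, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])  # sum of the ranks held by the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return w, z, p


if __name__ == "__main__":
    group_a = [1, 2, 3]  # hypothetical scores
    group_b = [4, 5, 6]
    print(rank_sum_test(group_a, group_b))
```

Because the test operates on ranks rather than raw scores, a single extreme score cannot dominate the statistic, which is one reason the test retains power under skewed distributions.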

Advantages of Using Nonparametric Techniques 

There are many advantages of using nonparametric techniques. Siegel (1956) 
outlined six main advantages. The first advantage is that for most nonparametric 
statistics, the “accuracy of the probability statement does not depend on the shape of 
the population” (p. 32). Further, the size of the sample is not as important, because 
small sample sizes will not cause the results to be misleading to the extent that small 
samples unduly affect parametric tests. The third advantage is that nonparametric 
statistics can be used when observations come from several different populations. 
Next, nonparametric statistics can be used with data that are ordinal, or ranked, as well
as with interval- and ratio-scaled data. Nonparametric statistics can be used with 
nominal data as well. Finally, for many researchers, nonparametric statistics can be 
easily learned and applied, at least at the univariate level. Most statistical computer 
software packages, such as the Statistical Package for the Social Sciences (SPSS; 
SPSS Inc., 2001) and the Statistical Analysis System (SAS Institute Inc., 2002), include 
nonparametric statistics. 

McSweeney and Katz (1978) summarized the reasons for using nonparametric
statistics. These include (a) nonparametric statistics have fewer assumptions, (b) 
nonparametric statistics can be used with rank-ordered data, (c) nonparametric 
statistics can be used with small samples, (d) data do not need to be normally 
distributed, and (e) outliers can be present. 

Hollander and Wolfe (1973) provided six reasons for using nonparametric 
statistics. Specifically, they contended that nonparametric methods (a) require few 
assumptions about the underlying population from which the data are collected; (b) do 
not necessitate the assumption of normality; (c) are often easier to apply than are their 
parametric counterparts; (d) are typically easy to understand; (e) are appropriate when 
parametric methods cannot be employed; and (f) are only slightly less efficient than 
parametric methods under normality, while being more efficient under non-normality. 

Further, when approximate normality is met, nonparametric tests are still
relatively efficient: the asymptotic relative efficiency of nonparametric tests with respect
to parametric tests can be as high as 95.5% (Gibbons, 1993; Hollander & Wolfe, 1973).
Consequently, in many cases, researchers have relatively little to lose by using 
nonparametric tests if the distribution is normal. If the distribution is not normal,
nonparametric tests likely are more efficient than are their parametric
counterparts. It is thus surprising that researchers do not utilize nonparametric tests 
more than they do. 
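
The 95.5% figure cited above presumably refers to the well-known asymptotic relative efficiency (ARE) of the Wilcoxon test relative to the t-test when the populations are normal. As a worked value, and on the assumption that this is the comparison the cited sources intend:

```latex
\mathrm{ARE}(W, t) \;=\; \frac{3}{\pi} \;\approx\; 0.955
\qquad \text{(normal populations)}
```

Read informally, under normality a t-test based on roughly 955 observations attains about the same power as a Wilcoxon test based on 1,000 observations, which is the sense in which little is lost by using the nonparametric test.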

Use of Nonparametric Statistics in Published Journal Articles 

Many graduate-level statistics and research methodology courses in the past 
have not included extensive information regarding nonparametric statistics (Aiken et al., 
1990; Sawilowsky, 1990; Winn & Johnson, 1978). In fact, Mundfrom et al. (1998) found 
that the chi-square statistic was the only nonparametric statistic presented in 
introductory-level statistics and research methodology classes. Further, of the inferential 
statistics cited, the statistics and research methodology instructors indicated that the 
chi-square test was the fourth least covered technique and was considered the
fourth least important topic (Mundfrom et al., 1998). Thus, nonparametric statistics
infrequently appear in published articles, including those in the most reputable journals 
(Elmore & Woehlke, 1996; Jenkins et al., 1984). Moreover, many researchers do not 
report whether assumptions were checked, or whether the data fit the assumptions. For 
example, Keselman et al. (1998) reported that fewer than one-fifth of articles (i.e., 19.7%)
“indicated some concern for distributional assumption violations” (p. 356). Similarly,
Onwuegbuzie (in press) found that only 11.1% of researchers discussed the extent to
which the assumptions of analysis of variance, analysis of covariance, multivariate
analysis of variance, or multivariate analysis of covariance were violated.

To better understand this phenomenon, Royeen (1986) identified five published 
studies that used parametric statistics. For each study, the data were checked to determine
whether they met the assumptions for the parametric statistic used. In three of the five
studies, the assumptions were not met. Next, the appropriate nonparametric statistic 
was computed on the data. For each of the three studies that did not meet the 
assumptions, there were large differences in the results yielded by the nonparametric 
statistic when compared with the published results from the parametric statistic. Thus, 
this examination demonstrates that if the assumptions are not met, the results can be 
very misleading. Furthermore, this examination exemplifies the problem that many 
studies have: if the assumptions have not been checked and they have not been met for 
the parametric statistics utilized, then the results are invalid. This is important to note 
when reading published studies that do not include information about whether or not the 
assumptions have been checked. 

A Typology of Nonparametric Statistics 

A myriad of nonparametric statistics exists for conducting distribution-free tests. 
The vast majority of these tests are readily available from the major statistical software 
(e.g., SPSS, SAS). A selection of some of the most common tests is provided in Table 1.



Insert Table 1 about here 



Nonparametric Effect Size Indices

As stipulated by the current edition of the American Psychological Association 
(APA) Publication Manual (2001): 
When reporting inferential statistics (e.g., t tests, F tests, and chi-square), include
information about the obtained magnitude or value of the test statistic, the 
degrees of freedom, the probability of obtaining a value as extreme as or more 
extreme than the one obtained, and the direction of the effect. ... Neither of the 
two types of probability value directly reflects the magnitude of an effect or the 
strength of a relationship. For the reader to fully understand the importance of 
your findings, it is almost always necessary to include some index of effect size 
or strength of relationship in your Results section... The general principle to be 
followed, however, is to provide the reader not only with information about 
statistical significance but also with enough information to assess the magnitude 
of the observed effect or relationship. (pp. 22-26)

Reporting effect sizes is no less important for statistically significant nonparametric 
findings than it is for statistically significant parametric results. However, when the few 
researchers who use nonparametric methods observe a statistically significant p-value, 
typically either they do not provide effect sizes, or they compute parametric-based effect 
sizes. First and foremost, statistically significant nonparametric statistics always should 
be followed up by some measure of effect size. However, it should be noted that just as 
parametric tests are adversely affected by departures from GLM assumptions, so too 
are parametric effect sizes (e.g., d, ω², ε²). For example, as noted by Onwuegbuzie and
Levin (2002), parametric effect sizes are affected by non-normality and heterogeneity of 
scores. Thus, whatever assumptions were violated that led to the use of nonparametric 
methods also would distort the parametric effect size. In fact, Hogarty and Kromrey 
(2001), using Monte Carlo methods, demonstrated that the most frequently used
effect-size estimates (e.g., d) are extremely sensitive to departures from normality and
homogeneity. Even trimmed effect-size measures (Hedges & Olkin, 1985; Yuen, 1974) 
exhibit extreme bias when the sample is small. 

Therefore, researchers should consider following up statistically significant 
nonparametric p-values with nonparametric effect sizes. Nonparametric effect sizes 
include Cramér’s V, the phi coefficient, and the odds ratio. These effect size indices,
which are appropriate for chi-square analyses, are readily available in SPSS and SAS.
Other nonparametric effect size estimates include (a) γ1 (Kraemer & Andrews, 1982),
which is based on the degree of overlap between samples; (b) the Common Language
(CL) effect-size statistic (McGraw & Wong, 1992), which indicates the relative frequency
with which a score sampled from one distribution is greater than a score sampled from a
second distribution; (c) Vargha and Delaney’s (2000) A, which is a measure of
stochastic superiority that is appropriate for ordinally scaled distributions; (d) Cliff’s
(1993) d, appropriate for comparing two groups, which assesses the equivalence of
probabilities of scores in each group being larger than scores in the other group (i.e.,
dominance); and (e) Wilcox and Muska’s (1999) W, a nonparametric analogue of ω²,
which estimates the degree of certainty with which an observation can be linked to one
population rather than the other. Of these five measures, Cliff’s d and Vargha and
Delaney’s A appear to be the most robust to violations of normality and heterogeneity of
variance (Hogarty & Kromrey, 2001). Unfortunately, none of these five nonparametric
measures is computed by the major statistical software programs. Thus, software
development companies can play an important role here in motivating researchers to 
follow up statistically significant nonparametric statistics with nonparametric effect sizes. 
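
Because several of these indices are not produced by the major packages, a brief sketch may help show that they are simple to compute by hand. The Python code below is illustrative only: the function names are hypothetical, the definitions follow the verbal descriptions given above (Cliff's d as the difference between the two dominance probabilities, and Vargha and Delaney's A as the probability of superiority with ties counted as one-half), and Cramér's V is computed from the ordinary Pearson chi-square statistic on a table of counts with no empty rows or columns.

```python
import math


def cliffs_d(x, y):
    """Cliff's d: P(X > Y) - P(X < Y), estimated over all pairs of scores."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def vargha_delaney_a(x, y):
    """Vargha and Delaney's A: P(X > Y) + 0.5 * P(X = Y)."""
    greater = sum(1 for a in x for b in y if a > b)
    ties = sum(1 for a in x for b in y if a == b)
    return (greater + 0.5 * ties) / (len(x) * len(y))


def cramers_v(table):
    """Cramer's V for an r x c contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi_square / (n * k))


if __name__ == "__main__":
    treatment = [4, 5, 6]  # hypothetical scores
    control = [1, 2, 3]
    print(cliffs_d(treatment, control), vargha_delaney_a(treatment, control))
    print(cramers_v([[10, 0], [0, 10]]))
```

The two dominance measures are linearly related (d = 2A - 1), so reporting either one conveys the same information; for a 2 x 2 table, Cramér's V equals the absolute value of the phi coefficient.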

Conclusions

For the last 50 years, nonparametric techniques have been underutilized, despite 
the fact that statistical software routinely allows the computation of an array of 
nonparametric statistics, and despite the fact that parametric techniques are highly
sensitive to extreme violations of GLM assumptions. Unfortunately, many researchers
are not being made adequately aware that nonparametric statistics provide viable 
alternatives to their parametric counterparts. Clearly, instructors, journal editors, and 
statistical software developers can play vital roles in promoting the nonparametric 
movement. In any case, much more work is needed to promote the use of distribution- 
free statistics. As such, we hope that the present call for the use of nonparametric 
techniques represents one small step in the right direction. 

Table 1: Typology of Nonparametric Statistics

  Method                                    Tests for

Measures of Association:
  Spearman’s Rank Correlation Coefficient   consistency
  Kendall’s Rank Correlation Coefficient    consistency
  Chi-square Test of Independence           concordance/discordance
  Tau                                       consistency
  Theil Test                                slope of regression line
  Cochran Test                              consistency
  Fisher’s Exact Test                       relationships

Single Population Tests:
  Binomial                                  proportions
  Kolmogorov-Smirnov Goodness-of-Fit Test   goodness of fit test for continuous data
  Sign Test                                 paired replicates
  Wilcoxon Signed Rank Test                 symmetry and equality of location
  Gupta Test                                symmetry
  Hodges-Lehmann One-sample Estimator       median

Comparison of Two Populations:
  Chi-square Test of Homogeneity            differences in proportions
  Wilcoxon (Mann-Whitney) Test              differences in location and spread
  Kolmogorov-Smirnov Two-Sample Test        differences between population distributions
  Rosenbaum’s Test                          differences in location
  Tukey’s Test                              differences in spread
  Hodges-Lehmann Two-sample Estimator       difference in medians
  Savage Test                               differences in spread when medians equal
  Ansari-Bradley Test                       differences in dispersion
  Moses Confidence Interval                 differences in location

Comparison of Several Populations:
  Kruskal-Wallis Test                       symmetry and equality of location
  Friedman’s Test                           symmetry and location (two-way data)
  Terpstra-Jonckheere Test                  medians equal vs. changing median
  Page’s Test                               ordered alternatives
  The Match Test for Ordered Alternatives   medians equal vs. medians ordered
  Miller’s Jackknife Test                   unknown squared ratio of scale differs from 1
  Hollander Test                            X and Y variables are interchangeable